Abstract

We study various transport-information inequalities under three different
notions of Ricci curvature in the discrete setting: the curvature-dimension
condition of Bakry and Emery [3], the exponential curvature-dimension condition
of Bauer et al. [5] and the coarse Ricci curvature of Ollivier [34]. We prove
that under a curvature-dimension condition or a coarse Ricci curvature
condition, an $L^1$ transport-information inequality holds, while under an exponential
curvature-dimension condition, some weak transport-information inequalities
hold. As an application, we establish a Bonnet-Myers theorem under the
curvature-dimension condition $CD(\kappa, \infty)$ of Bakry and Emery [3].


1 Introduction 

In the analysis of the geometry of Riemannian manifolds, Ricci curvature plays 
an important role. In particular, Ricci curvature lower bounds immediately yield 
powerful functional inequalities, such as the logarithmic Sobolev inequality, which 
in turn implies transport-entropy and transport-information inequalities. Each of 
these inequalities has its own interest and has various applications, such as
concentration bounds and estimates on the speed of convergence to equilibrium for
Markov chains. We refer the reader to Him as] for more about the links between 
curvature and functional inequalities, and to mm for applications of functional 
inequalities. 

However, when the space we consider is a graph, those theories are not as clear 
as in the continuous settings. The first question one would want to answer is how 
to define Ricci curvature lower bounds in discrete settings. The natural approach 
would be to define it as a discrete analogue of a definition valid in the continuous 
setting. There are several equivalent definitions one can try to use (see [2] for
those definitions in the continuous settings and for the equivalences between them). 
However, in discrete spaces, we lose the chain rule, and these definitions are no 
longer equivalent. 

Several notions of curvature have been proposed in the last few years. Here 
we shall consider three of them: the curvature-dimension condition of Bakry and 
Emery [3], the exponential curvature-dimension condition of Bauer et al. [5] and the
coarse Ricci curvature of Ollivier [34]. Other notions that have been developed (and
which we shall not discuss further here) include the entropic Ricci curvature defined
by Erbar and Maas in [13] and Mielke in [31], which is based on the Lott-Sturm-Villani
definition of curvature 123 HQ], geodesic convexity along interpolations in
m and [25], and rough curvature bounds in [8]. It is still an open problem to compare
these various notions of curvature. We refer readers to the forthcoming survey m 
for a more general introduction. 

The aim of this work is to obtain functional inequalities under the above three 
notions of curvature conditions and give some applications. 

Let us begin with setting the framework of Markov chains on discrete spaces: 

Markov chain on graphs and curvature condition 

Let $\mathcal{X}$ be a finite (or countably infinite) discrete space and $K$ be an irreducible
Markov kernel on $\mathcal{X}$. Assume that for any $x \in \mathcal{X}$, we have

$$\sum_{y} K(x,y) = 1. \tag{1}$$

This condition is a normalization of the time scale, enforcing that jump attempts
occur at rate 1. We also define $J(x) := 1 - K(x,x)$ and $J := \sup_{x \in \mathcal{X}} J(x)$. $J$ is a
measure of the laziness of the chain, estimating how often jump attempts end with
the particle not moving. Since we assume the kernel is irreducible, $0 < J \le 1$.

We shall always assume there exists a reversible invariant probability measure $\pi$,
satisfying the detailed balance relation

$$K(x,y)\,\pi(x) = K(y,x)\,\pi(y) \quad \forall x, y \in \mathcal{X}.$$

We denote by $L$ the generator of the continuous-time Markov chain associated to
the kernel $K$, which is given by

$$Lf(x) = \sum_{y} (f(y) - f(x)) K(x,y).$$

Let $P_t = e^{tL}$ be the associated semigroup, acting on functions, and $P_t^*$ its adjoint,
acting on measures. We also define the $\Gamma$ operator, given by

$$\Gamma(f,g)(x) := \frac{1}{2} \sum_{y} (f(y) - f(x))(g(y) - g(x)) K(x,y)$$

and write $\Gamma(f) := \Gamma(f,f)$.
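
As a concrete illustration of these definitions, the following minimal sketch evaluates $L$ and $\Gamma$ for a small kernel stored as a nested list. The lazy walk on a three-vertex path, its measure `pi`, and the helper names `generator`, `gamma` and `is_reversible` are assumptions made for the example; they are not objects taken from the paper.

```python
# Sketch: generator L and operator Gamma for a finite Markov kernel.
# The kernel K, the measure pi and all helper names are illustrative assumptions.

def generator(K, f):
    """Lf(x) = sum_y (f(y) - f(x)) K(x, y)."""
    n = len(K)
    return [sum((f[y] - f[x]) * K[x][y] for y in range(n)) for x in range(n)]

def gamma(K, f, g):
    """Gamma(f, g)(x) = 1/2 sum_y (f(y) - f(x)) (g(y) - g(x)) K(x, y)."""
    n = len(K)
    return [0.5 * sum((f[y] - f[x]) * (g[y] - g[x]) * K[x][y] for y in range(n))
            for x in range(n)]

def is_reversible(K, pi, tol=1e-12):
    """Detailed balance: K(x, y) pi(x) = K(y, x) pi(y) for all x, y."""
    n = len(K)
    return all(abs(K[x][y] * pi[x] - K[y][x] * pi[y]) <= tol
               for x in range(n) for y in range(n))

# Lazy random walk on a path with three vertices; every row sums to 1.
K = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]
f = [0.0, 1.0, 3.0]

Lf = generator(K, f)
# Integral of Lf against pi (vanishes because pi is invariant).
mean_Lf = sum(p * v for p, v in zip(pi, Lf))
# Integral of Gamma(f) against pi, and -integral of f * Lf against pi.
energy = sum(p * v for p, v in zip(pi, gamma(K, f, f)))
dirichlet = -sum(p * u * v for p, u, v in zip(pi, f, Lf))
J = max(1.0 - K[x][x] for x in range(len(K)))
```

For a reversible measure one expects $\int Lf\,d\pi = 0$ and $\int \Gamma(f)\,d\pi = -\int f\,Lf\,d\pi$, which is what `mean_Lf`, `energy` and `dirichlet` allow one to check.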

With this $\Gamma$ operator, we are able to introduce the Bakry-Emery curvature
condition $CD(\kappa, \infty)$:

Definition 1.1. We define the iterated $\Gamma$ operator $\Gamma_2$ as

$$\Gamma_2(f) = \frac{1}{2} L\Gamma(f) - \Gamma(f, Lf).$$

We say that the curvature condition $CD(\kappa, \infty)$ is satisfied if, for all functions $f$,
we have

$$\Gamma_2(f) \ge \kappa\, \Gamma(f).$$
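
To make Definition 1.1 concrete, here is a small sketch that evaluates $\Gamma_2$ on the two-point space with kernel $K = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}$. The parameters `a`, `b` and the helper names are assumptions chosen for the illustration. On two points, both $\Gamma_2(f)$ and $\Gamma(f)$ are proportional to $(f(1) - f(0))^2$, so the ratio $\Gamma_2(f)/\Gamma(f)$ at each point does not depend on $f$; a direct computation from the definition gives $(a + 3b)/2$ at the first point and $(3a + b)/2$ at the second.

```python
# Sketch: Gamma_2 and the best CD(kappa, infinity) constant on a two-point chain.
# The parameters a, b and the helper names are illustrative assumptions.

def generator(K, f):
    n = len(K)
    return [sum((f[y] - f[x]) * K[x][y] for y in range(n)) for x in range(n)]

def gamma(K, f, g):
    n = len(K)
    return [0.5 * sum((f[y] - f[x]) * (g[y] - g[x]) * K[x][y] for y in range(n))
            for x in range(n)]

def gamma2(K, f):
    """Gamma_2(f) = 1/2 L Gamma(f) - Gamma(f, Lf)."""
    half_L_gamma = [0.5 * v for v in generator(K, gamma(K, f, f))]
    cross = gamma(K, f, generator(K, f))
    return [u - v for u, v in zip(half_L_gamma, cross)]

a, b = 0.3, 0.6
K = [[1 - a, a],
     [b, 1 - b]]
f = [0.0, 1.7]  # any non-constant function gives the same ratios

g1 = gamma(K, f, f)
g2 = gamma2(K, f)
ratios = [g2[x] / g1[x] for x in range(2)]
kappa = min(ratios)  # largest kappa with Gamma_2(f) >= kappa * Gamma(f) here
```
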
Since we shall deal with three different types of Ricci curvature lower bounds,
in order to avoid confusion, we always write $\kappa$ for the Bakry-Emery curvature
condition, $\kappa_e$ for the exponential curvature-dimension condition and $\kappa_c$ for a lower
bound on the coarse Ricci curvature. Throughout the paper, $\kappa$, $\kappa_e$, $\kappa_c$ will always
be positive numbers.

The $\Gamma_2$ operator and the curvature condition were first introduced in [3], and
used to prove functional inequalities, such as logarithmic Sobolev inequalities and
Poincare inequalities, for measures on Riemannian spaces satisfying $CD(\kappa, \infty)$ for
$\kappa > 0$. In the Riemannian setting, the $\Gamma_2$ operator involves the Ricci tensor of
the manifold, and the condition $CD(\kappa, \infty)$ is equivalent to asking for lower bounds
on the Ricci curvature, and more generally to the Lott-Sturm-Villani definition of
lower bounds on Ricci curvature (see m and @01 for the definition, and [2] for the
equivalence between the two notions). Hence $CD(\kappa, \infty)$ can be used as a definition
of lower bounds on the Ricci curvature for nonsmooth spaces, and even discrete
spaces. This was the starting point of a very fruitful direction of research on the
links between curvature and functional inequalities. In most cases, the focus was on
the continuous setting, and the operator $L$ was assumed to be a diffusion operator.

In the discrete setting, this curvature condition was first studied by Schmuckenschläger
in [38] and then by S.-T. Yau and his collaborators in la El El EB]. It has
also been used in [23], where a discrete version of Buser's inequality was obtained,
as well as curvature bounds for various graphs, such as abelian Cayley graphs and
slices of the hypercube. Note that most of these works are set in the framework of
graphs rather than Markov chains, which generally makes our definitions and theirs
differ by a normalization constant, since we enforce the condition (1).

As we have mentioned, the main difference between the continuous and discrete
settings is that when the operator $L$ is not a diffusion operator, we lose the chain
rule. This leads to additional difficulties, and some results, such as certain forms 
of the logarithmic Sobolev inequality, do not seem to hold anymore. On the other 
hand, one of the main difficulties in the continuous setting is to exhibit an algebra 
of smooth functions satisfying certain conditions, while this property immediately 
holds in the discrete setting. 

The key chain rule used in the continuous setting is the identity

$$L(\Phi(f)) = \Phi'(f)\, Lf + \Phi''(f)\, \Gamma(f),$$

which characterizes diffusion operators in the continuous setting, and does not hold
in discrete settings. However, a key observation of [5] is that when $\Phi(x) = \sqrt{x}$, the
identity

$$2\sqrt{f}\, L\sqrt{f} = Lf - 2\Gamma(\sqrt{f})$$

holds, even in the discrete setting. This observation motivated the introduction of 
a modified version of the curvature-dimension condition, designed to exploit this 
identity: 

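Definition 1.2. We define the modified $\Gamma_2$ operator $\widetilde{\Gamma}_2$ as

$$\widetilde{\Gamma}_2(f,f) := \Gamma_2(f) - \Gamma\left(f, \frac{\Gamma(f)}{f}\right).$$

We say that the exponential curvature condition $CDE'(\kappa_e, \infty)$ is satisfied if, for all
nonnegative functions $f$ and all $x \in \mathcal{X}$, we have

$$\widetilde{\Gamma}_2(f)(x) \ge \kappa_e\, \Gamma(f)(x).$$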
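
The square-root identity $2\sqrt{f}\,L\sqrt{f} = Lf - 2\Gamma(\sqrt{f})$ that motivates this definition follows from $g(y)^2 - g(x)^2 = (g(y) - g(x))^2 + 2g(x)(g(y) - g(x))$ applied to $g = \sqrt{f}$. The sketch below checks it numerically; the kernel `K`, the function `f` and the helper names are assumptions chosen for the example.

```python
import math

# Sketch: numerical check of 2 sqrt(f) L sqrt(f) = Lf - 2 Gamma(sqrt(f)).
# The kernel K and the positive function f are illustrative assumptions.

def generator(K, f):
    n = len(K)
    return [sum((f[y] - f[x]) * K[x][y] for y in range(n)) for x in range(n)]

def gamma(K, f, g):
    n = len(K)
    return [0.5 * sum((f[y] - f[x]) * (g[y] - g[x]) * K[x][y] for y in range(n))
            for x in range(n)]

K = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
f = [1.0, 4.0, 9.0]
sqrt_f = [math.sqrt(v) for v in f]

# Left-hand side: 2 sqrt(f) L sqrt(f), evaluated pointwise.
lhs = [2.0 * s * Ls for s, Ls in zip(sqrt_f, generator(K, sqrt_f))]
# Right-hand side: Lf - 2 Gamma(sqrt(f)), evaluated pointwise.
rhs = [Lf - 2.0 * G for Lf, G in zip(generator(K, f), gamma(K, sqrt_f, sqrt_f))]
```
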
Remark 1.1. We use the notation $CDE'(\kappa_e, \infty)$ to agree with the notations of [5],
where they also consider the case when the condition is only satisfied at points $x$
where $Lf(x) < 0$.

In it is shown that $CDE'(\kappa_e, \infty)$ implies $CD(\kappa, \infty)$ with $\kappa = \kappa_e$. When the
operator $L$ is a diffusion, the conditions $CDE'(\kappa_e, \infty)$ and $CD(\kappa, \infty)$ are equivalent.

Under this notion of curvature, Bauer et al. prove in [5] various Li-Yau inequalities
on graphs, and then deduce heat kernel estimates and a Buser inequality for
graphs. In [6], it was shown that the $CDE'(\kappa_e, \infty)$ condition tensorizes, and that
the associated heat kernel satisfies some Gaussian bounds.

The third notion of curvature we shall now introduce is the coarse Ricci curvature.
In order to define it, we first need to introduce Wasserstein distances.

Let $d$ be a distance on $\mathcal{X}$. The Wasserstein distance is defined as follows:

Definition 1.3 ($L^p$-Wasserstein distances). Let $p \ge 1$. The $L^p$-Wasserstein distance
$W_p$ between two probability measures $\mu$ and $\nu$ on a metric space $(\mathcal{X}, d)$ is
defined as

$$W_p(\mu, \nu) := \left( \inf_{\pi} \int d(x,y)^p \, \pi(dx\,dy) \right)^{1/p},$$

where the infimum runs over all couplings $\pi$ of $\mu$ and $\nu$.
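
Computing $W_p$ in general requires solving an optimal transport problem. As a minimal sketch, consider the special case (an assumption of this example, not of the paper) where the points are $0, 1, \dots, n-1$ with $d(i,j) = |i - j|$, as for the graph distance of a path. In that case $W_1$ equals the sum of the absolute differences of the cumulative distribution functions. The function name is hypothetical.

```python
from itertools import accumulate

# Sketch: W_1 between two probability vectors on the points 0, 1, ..., n-1
# with distance |i - j| (path graph). This one-dimensional formula is an
# assumption of the example; it is not valid for a general metric.

def wasserstein1_on_path(mu, nu):
    """Sum over i of |F_mu(i) - F_nu(i)|, with F the cumulative distribution."""
    cdf_gaps = accumulate(m - n for m, n in zip(mu, nu))
    return sum(abs(gap) for gap in cdf_gaps)

# Moving a unit mass from point 0 to point 2 costs 2.
w_far = wasserstein1_on_path([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# Shifting each half-mass by one step costs 1 in total.
w_shift = wasserstein1_on_path([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```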

Finally, we recall the definition of coarse Ricci curvature, which has been introduced
by Ollivier in [34] for discrete-time Markov chains. Since we shall work in
continuous time, we shall give the appropriate variant, introduced in [21]. Previous
works considering contraction rates in transport distance include HB [371133].
Applications to error estimates for Markov Chain Monte Carlo were studied in [22].
The continuous-time version we use here was introduced in [21]. The particular
case of curvature on graphs has been studied in [20].

Definition 1.4 (Coarse Ricci curvature). The coarse Ricci curvature of the Markov
chain is said to be bounded from below by $\kappa_c$ if, for all probability measures $\mu$ and
$\nu$ and any time $t \ge 0$, we have

$$W_1(P_t^* \mu, P_t^* \nu) \le \exp(-\kappa_c t)\, W_1(\mu, \nu),$$

i.e. if it is a contraction in $W_1$ distance, with rate $\kappa_c$.

Note that unlike the $CD(\kappa, \infty)$ condition, this property does not only depend
on the Markov chain, but also on the choice of the distance $d$.

In this work, there are two distances we shall be interested in. In the rest of the
paper, we write $d_g$ for the graph distance associated to the Markov kernel. If we
consider $\mathcal{X}$ as the set of vertices of a graph, with edges between all pairs of vertices
$(x,y)$ such that $K(x,y) > 0$, $d_g$ shall be the usual graph distance. More formally,
it is defined as

$$d_g(x,y) := \inf\{n \in \mathbb{N};\ \exists x_0, \dots, x_n \mid x_0 = x,\ x_n = y,\ K(x_i, x_{i+1}) > 0\ \ \forall\, 0 \le i \le n-1\}.$$
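
The infimum in this definition can be computed by a breadth-first search over the pairs with $K(x,y) > 0$. The following sketch assumes a finite state space indexed by $0, \dots, n-1$ and a kernel stored as a nested list; the function name and the example kernel are hypothetical.

```python
import math
from collections import deque

# Sketch: graph distance d_g induced by a finite Markov kernel, computed by
# breadth-first search. The kernel K and the function name are assumptions.

def graph_distance(K, x, y):
    """Smallest n such that a path x = x_0, ..., x_n = y has K(x_i, x_{i+1}) > 0."""
    n = len(K)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if u == y:
            return dist[u]
        for v in range(n):
            if v != u and K[u][v] > 0 and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return math.inf  # y is not reachable from x

# Lazy random walk on a path with three vertices.
K = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
d02 = graph_distance(K, 0, 2)
```
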
In Section 3 we shall also consider the distance $d_\Gamma$, defined by

$$d_\Gamma(x,y) = \sup_{f;\ \Gamma(f) \le 1} f(x) - f(y).$$
The main property that makes $d_\Gamma$ interesting is that the 1-Lipschitz functions for
$d_\Gamma$ are automatically characterized as the functions $f$ such that $\Gamma(f) \le 1$. In this
case, we write $W_{p,d_\Gamma}$ instead of $W_p$ for the Wasserstein distance of the space $(\mathcal{X}, d_\Gamma)$.
One reason to consider this distance is that $d_\Gamma$ is the exact analog of the classical
situation for continuous metric spaces, where $\Gamma(f) = |\nabla f|^2$. On the other hand,
observe that $\Gamma(d_g(x, \cdot)) \le J/2$ holds for all $x$, since $d_g(x, \cdot)$ changes by at most 1
along each jump. It follows that

$$d_g(x, \cdot) \le \sqrt{\frac{J}{2}}\, d_\Gamma(x, \cdot). \tag{2}$$

Thus for all $x, y \in \mathcal{X}$,

$$d_g(x,y) \le \sqrt{\frac{J}{2}}\, d_\Gamma(x,y). \tag{3}$$

Therefore the estimates on $d_\Gamma$ are stronger than estimates on $d_g$ with a constant $\sqrt{J/2}$,
which is bounded by $\sqrt{2}/2$. Of course, this also means that functional inequalities
involving $d_g$ shall be easier to obtain, at least in some situations.


Functional inequalities 

Now we turn to functional inequalities on graphs: 

Definition 1.5. (Fisher information) Let $f$ be a nonnegative function defined on
$\mathcal{X}$. Define the Fisher information of $f$ with respect to $\pi$ as

$$I_\pi(f) := 4 \int \Gamma(\sqrt{f})\, d\pi = 2 \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{X}} \left( \sqrt{f(y)} - \sqrt{f(x)} \right)^2 K(x,y)\, \pi(x).$$

The factor 4 in this definition comes from the analogy with the continuous setting,
where

$$4 \int |\nabla \sqrt{f}|^2\, d\pi = \int |\nabla \log f|^2 f\, d\pi.$$

In the continuous setting, the Fisher information can be written as $\int \nabla \log f \cdot \nabla f\, d\pi$,
so we can define a modified Fisher information as

$$\widetilde{I}_\pi(f) := \int \Gamma(f, \log f)\, d\pi, \tag{4}$$

which corresponds to the entropy production functional of the Markov chain.

There is a third way to rewrite the Fisher information for the continuous settings
as $\int \frac{|\nabla f|^2}{f}\, d\pi$, and one can also define another modified Fisher information as

$$\bar{I}_\pi(f) := \int \frac{\Gamma(f)}{f}\, d\pi.$$


Of course, there are many other ways to re-write the Fisher information in the 
continuous setting, each leading to a different definition in the discrete setting. We 
only stated here the three versions we shall use in this work. 

In the discrete setting, $I_\pi(f)$, $\widetilde{I}_\pi(f)$ and $\bar{I}_\pi(f)$ are not equal in general. It is
easy to see that $I_\pi(f) \le \widetilde{I}_\pi(f)$ and $I_\pi(f) \le \bar{I}_\pi(f)$. If $f$ is the density function of a
probability measure $\nu$ with respect to $\pi$, since $(\sqrt{f(y)} - \sqrt{f(x)})^2 \le f(x) + f(y)$,
and since $\pi$ is reversible, one can deduce that

$$I_\pi(f) \le 2 \sum_{x} \sum_{y \ne x} (f(x) + f(y)) K(x,y)\, \pi(x) \le 4J.$$

Here we can see that in discrete settings, the Fisher information is in fact bounded
from above, which is not true in continuous settings.
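
The three discrete Fisher informations are simple double sums and can be compared numerically. The sketch below evaluates $4\int \Gamma(\sqrt{f})\,d\pi$, $\int \Gamma(f, \log f)\,d\pi$ and $\int \Gamma(f)/f\,d\pi$ for a strictly positive density and checks the bound $4J$. The kernel, the density and the names `I`, `I_tilde`, `I_bar` are assumptions made for the example.

```python
import math

# Sketch: the three discrete Fisher informations for a positive density f.
# The kernel K, the measure pi, the density f and all names are assumptions.

def fisher_informations(K, pi, f):
    n = len(K)
    I = I_tilde = I_bar = 0.0
    for x in range(n):
        for y in range(n):
            w = K[x][y] * pi[x]
            # 4 * Gamma(sqrt f): factor 4 * (1/2) = 2 in front of the square.
            I += 2.0 * (math.sqrt(f[y]) - math.sqrt(f[x])) ** 2 * w
            # Gamma(f, log f)
            I_tilde += 0.5 * (f[y] - f[x]) * (math.log(f[y]) - math.log(f[x])) * w
            # Gamma(f) / f
            I_bar += 0.5 * (f[y] - f[x]) ** 2 / f[x] * w
    return I, I_tilde, I_bar

K = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = [0.25, 0.50, 0.25]
f = [2.0, 0.5, 1.0]  # density with respect to pi: sum of f * pi equals 1

total_mass = sum(p * v for p, v in zip(pi, f))
J = max(1.0 - K[x][x] for x in range(len(K)))
I, I_tilde, I_bar = fisher_informations(K, pi, f)
```

On this example the three values are distinct and $I_\pi(f)$ is the smallest, in line with the comparison stated above.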

Let us recall the definition of the relative entropy $\mathrm{Ent}_\pi$ as well:

Definition 1.6. (Relative entropy) Assuming that $f$ is a nonnegative function on
$\mathcal{X}$, we define the relative entropy of $f$ with respect to $\pi$ as follows:

$$\mathrm{Ent}_\pi(f) := \sum_{x} f(x) \log f(x)\, \pi(x) - \sum_{x} f(x)\, \pi(x) \log\left( \sum_{x} f(x)\, \pi(x) \right).$$

Note that when $f$ is a probability density with respect to $\pi$, the second term takes
value 0.

Definition 1.7. Let $\pi$ be a probability measure on $\mathcal{X}$ and $p \ge 1$. We say that $\pi$
satisfies

(i) the logarithmic Sobolev inequality with constant $C$, if for all nonnegative
functions $f$, we have

$$\mathrm{Ent}_\pi(f) \le \frac{1}{2C} I_\pi(f) \qquad (LSI(C));$$

(ii) the modified logarithmic Sobolev inequality with constant $C$, which we shall
write $mLSI(C)$, if for all nonnegative functions $f$, we have

$$\mathrm{Ent}_\pi(f) \le \frac{1}{2C} \int \Gamma(f, \log f)\, d\pi \qquad (mLSI(C));$$

(iii) the transport-entropy inequality $T_p(C)$ if for all probability measures $\nu = f\pi$,
we have

$$W_p(f\pi, \pi)^2 \le \frac{2}{C} \mathrm{Ent}_\pi(f) \qquad (T_p(C));$$

(iv) the transport-information inequality $T_pI(C)$ if for all probability measures
$\nu = f\pi$, we have

$$W_p(f\pi, \pi)^2 \le \frac{1}{C^2} I_\pi(f) \qquad (T_pI(C)).$$

In the continuous setting, mLSI and LSI are the same inequality, but in the 
discrete setting they correspond to distinct properties of the Markov chain, namely 
hypercontractivity for LSI and exponential convergence to equilibrium in relative 
entropy for mLSI. In general, LSI implies mLSI, but the converse is not true. 
We refer to [7] for more on the difference between the two inequalities. 

In the discrete setting, when $p = 1$, the following relations between the inequalities
hold, in the same way as in the continuous setting:

$$LSI(C) \Rightarrow T_1I(C) \Rightarrow T_1(C).$$

The $T_1$ inequality is equivalent to Gaussian concentration for $\pi$ (see for example [24]
and the next section), but not dimension-free concentration, and is therefore strictly
weaker than $T_2$. When $p = 1$, the transport-information inequality is equivalent to
Gaussian concentration for the occupation measure of the Markov chain (see [18]),
and is therefore useful to get a priori bounds on the statistical error for Markov
Chain Monte Carlo estimation of averages.

One of the most interesting cases in the continuous setting is the transport-entropy
inequality when $p = 2$, also called the Talagrand inequality, which
was introduced in mi. It is equivalent to dimension-free Gaussian concentration
for $\pi$.

One can show the following relationships: 

$$CD(\kappa, \infty) \Rightarrow LSI(\kappa) \Rightarrow T_2I(\kappa) \Rightarrow T_2(\kappa).$$

We refer to [36] and [19] for the proofs of these implications. With those inequalities
in hand, one can prove some dimension-free concentration results on a metric- 
measure space (see M)- 

However, the results for $p = 2$ fail to be true in discrete settings, as we will
see in the next section. When $\mathcal{X}$ is a graph, $\pi$ never satisfies $T_2$, unless it is a Dirac
measure (see for example m, or Section 2). To recover a discrete version of $T_2$,
we therefore have to redefine the transport cost. Erbar and Maas recovered some
of those functional inequality results with the notion of entropic Ricci curvature on
graphs, and we refer the reader to [13, 29] for more details. Another way to deal
with it is to take the weak transport cost introduced by Marton m- 

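Definition 1.8. Let $(\mathcal{X}, d)$ be a Polish space and $\mu$, $\nu$ two probability measures on
$\mathcal{X}$. Define

$$\widetilde{\mathcal{T}}_2(\nu|\mu) := \inf_{\pi \in \Pi(\mu,\nu)} \int \left( \int d(x,y)\, p_x(dy) \right)^2 \mu(dx),$$

where $\Pi(\mu,\nu)$ is the set of all couplings $\pi$ whose first marginal is $\mu$ and second
marginal is $\nu$, and $p_x$ is the probability kernel such that $\pi(dx\,dy) = p_x(dy)\,\mu(dx)$. Using
probabilistic notations, one has

$$\widetilde{\mathcal{T}}_2(\nu|\mu) = \inf_{X \sim \mu,\ Y \sim \nu} \mathbb{E}\left( \left( \mathbb{E}(d(X,Y)|X) \right)^2 \right).$$

Note that the weak transport cost can also be seen as a weak Wasserstein-like
distance. In order to agree with the notations of the Wasserstein distance, we write
$\widetilde{W}_2(\nu|\mu)^2 := \widetilde{\mathcal{T}}_2(\nu|\mu)$. However, it is not a distance, since it is not symmetric. Note
that by Jensen's inequality, both $\widetilde{W}_2(\nu|\mu)$ and $\widetilde{W}_2(\mu|\nu)$ are larger than $W_1(\mu,\nu)$.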
Definition 1.9. Adapting the settings of Definition 1.7, we say that $\pi$ satisfies

(v) the weak transport-entropy inequality $\widetilde{T}_2^+(C)$ if for all probability measures
$\nu = f\pi$, we have

$$\widetilde{W}_2(f\pi|\pi)^2 \le \frac{2}{C} \mathrm{Ent}_\pi(f);$$

(vi) the weak transport-entropy inequality $\widetilde{T}_2^-(C)$ if for all probability measures
$\nu = f\pi$, we have

$$\widetilde{W}_2(\pi|f\pi)^2 \le \frac{2}{C} \mathrm{Ent}_\pi(f);$$

(vii) the weak transport-information inequality $\widetilde{T}_2^+I(C)$ if for all probability
measures $\nu = f\pi$, we have

$$\widetilde{W}_2(f\pi|\pi)^2 \le \frac{1}{C^2} I_\pi(f);$$

(viii) the weak transport-information inequality $\widetilde{T}_2^-I(C)$ if for all probability
measures $\nu = f\pi$, we have

$$\widetilde{W}_2(\pi|f\pi)^2 \le \frac{1}{C^2} I_\pi(f).$$

Here we only consider the case when the cost function is quadratic. For more
general results about weak transport inequalities, we refer to [16] and [39].

Our main results are the following theorems: 

Theorem 1.10. Let $(\mathcal{X}, d_\Gamma)$ be a connected graph equipped with the $\Gamma$-distance $d_\Gamma$. Let
$K$ be an irreducible Markov kernel on $\mathcal{X}$ and $\pi$ the reversible invariant probability
measure associated to $K$. Assume that $CD(\kappa, \infty)$ holds with $\kappa > 0$. Then $\pi$ satisfies
the transport-information inequality $T_1I$ with constant $\kappa$. More precisely, for all
probability measures $\nu := f\pi$ on $\mathcal{X}$, it holds

$$W_{1,d_\Gamma}(f\pi, \pi)^2 \le \frac{1}{\kappa^2} I_\pi(f).$$

With such a result in hand, we can then follow the work of Guillin, Leonard,
Wang and Wu [19] to prove that a transport-entropy inequality $T_1$ holds, so that the
Gaussian concentration property follows as well. Another application is that after
a simple computation, one can obtain the following Bonnet-Myers type theorem:

Corollary 1.11. Assume that $CD(\kappa, \infty)$ holds. Then

dg{x, y)K if 2 mm{v^J(a:), ^/J{y)} + \/J{y)^ - 

Recall that under the coarse Ricci curvature condition, Ollivier has proved in [34]
the same type of inequality: $\kappa_c\, d(x,y) \le J(x) + J(y)$. Here we get a diameter estimate
with a better order, but we lose a constant of order 2 under the condition $CD(\kappa, \infty)$.

Now if we make the stronger assumption $CDE'(\kappa_e, \infty)$, we get a stronger inequality:

Theorem 1.12. Let $(\mathcal{X}, d)$ be a connected graph equipped with the graph distance $d$. Let
$K$ be an irreducible Markov kernel on $\mathcal{X}$ and $\pi$ the reversible invariant probability
measure associated to $K$. Assume that $CDE'(\kappa_e, \infty)$ holds with $\kappa_e > 0$. Then
$\pi$ satisfies the transport-information inequality $\widetilde{T}_2^+I$ with constant $\kappa_e/\sqrt{2}$. More
precisely, for all probability measures $\nu := f\pi$ on $\mathcal{X}$, it holds

$$\widetilde{W}_2(f\pi|\pi)^2 \le \frac{2}{\kappa_e^2} I_\pi(f) \le \frac{2}{\kappa_e^2} \widetilde{I}_\pi(f).$$

Again, following the ideas of [19], one can prove a weak transport-entropy inequality
$\widetilde{T}_2^+H$. On the other hand, since the weak transport cost is stronger than
the $L^1$-Wasserstein distance, it immediately yields that $T_1I$ holds, which implies $T_1H$
and concentration results.

Under the coarse Ricci curvature condition, the inequality $T_1I(\kappa_c)$ holds:

Theorem 1.13. Let $\mathcal{X}$, $\pi$, $K$ be defined as before. If the global coarse Ricci curvature
is bounded from below by $\kappa_c > 0$, then the following transport inequality holds for
all density functions $f$:

Wi(/7r,7r)2 ^ \x^[f) (j-ll^{f)) ^ \x^{f). 

Kc \ O J 
As a corollary, this last result implies a $T_1$ inequality for such Markov chains,
which has been previously obtained by Eldan, Lehec and Lee [12].

The paper is organized as follows: in the next section we will explain why $T_2H$
cannot hold in general, then establish the connection between the results in [39] and the
Fisher information in the graph setting. Section 2 also gives a few preliminary results about
Hamilton-Jacobi equations on graphs. In the third, fourth and fifth sections we
will discuss functional inequalities under $CD(\kappa, \infty)$, $CDE'(\kappa_e, \infty)$ and coarse Ricci
curvature $\kappa_c$ respectively. In the last section we will show some applications, such
as how a transport-information inequality implies a transport-entropy inequality,
concentration results, a discrete analogue of the Bonnet-Myers theorem, and a study
of the example of the discrete hypercube.


2 Preliminary 


In this section, we present some general results in the discrete setting, without
assuming any curvature condition. Our main concern is to present the Hamilton-Jacobi
equations on graphs introduced in [16, 39] and their relation with weak
transport costs.

As we mentioned in the introduction, in the discrete setting, when $\pi$ is not a
Dirac mass, the inequality $T_2(\kappa)$ cannot hold true, for any $\kappa > 0$. To our knowledge,
this was first proved in [15]. We give here a different proof, as a consequence of a
more general result:

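Lemma 2.1. Let $(\mathcal{X}, d)$ be a metric space and $\mu$ a probability measure defined on
$\mathcal{X}$. Assume that there exist $C_1, C_2 \subset \mathcal{X}$ such that

(i) $\inf_{x \in C_1,\, y \in C_2} d(x,y) > 0$,

(ii) $\mathrm{supp}(\mu) \subset C_1 \cup C_2$,

(iii) $\mu(C_1) > 0$, $\mu(C_2) > 0$.

Then $\mu$ does not satisfy $T_2(\kappa)$ for any $\kappa > 0$.

Proof. For $0 < h < \min\{\mu(C_1), \mu(C_2)\}$, define

$$\nu(dx) := \left( 1 + \frac{h}{\mu(C_1)} \right) \mathbf{1}_{C_1}(x)\, \mu(dx) + \left( 1 - \frac{h}{\mu(C_2)} \right) \mathbf{1}_{C_2}(x)\, \mu(dx).$$

Let $\delta := \inf_{x \in C_1,\, y \in C_2} d(x,y) > 0$. Then we have $W_2(\mu,\nu)^2 \ge \delta^2 h$, and the entropy
of $\nu$ with respect to $\mu$ is

$$(\mu(C_1) + h) \log(1 + h/\mu(C_1)) + (\mu(C_2) - h) \log(1 - h/\mu(C_2)).$$

When $h$ goes to 0, the entropy is $O(h^2)$. The conclusion follows since $W_2(\mu,\nu)^2$ is
of order at least $h$. □

Thus, if $\pi$ satisfies $T_2$ on graphs, it means that $\pi$ is a Dirac mass.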
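
The scaling argument in the proof of Lemma 2.1 can be observed numerically on the simplest example. The sketch below takes the two-point space with $\mu = (p, 1-p)$ and distance $\delta$ between the two points (these choices and the function names are assumptions made for the illustration). Since a mass $h$ has to travel the distance $\delta$, the squared $W_2$ distance is $\delta^2 h$, while the entropy behaves like $\frac{h^2}{2}\left(\frac{1}{p} + \frac{1}{1-p}\right)$, so the ratio between the two blows up as $h \to 0$.

```python
import math

# Sketch: why T_2 fails on a two-point space. The mass p, the distance delta
# and the function names are illustrative assumptions.

def perturbed_entropy(p, h):
    """Entropy of nu = (p + h, 1 - p - h) with respect to mu = (p, 1 - p)."""
    q = 1.0 - p
    return (p + h) * math.log(1.0 + h / p) + (q - h) * math.log(1.0 - h / q)

def w2_squared_two_points(h, delta=1.0):
    """A mass h must be moved across the distance delta."""
    return delta * delta * h

def ratio(p, h):
    """W_2^2 divided by the entropy; T_2 would require this to stay bounded."""
    return w2_squared_two_points(h) / perturbed_entropy(p, h)

p = 0.5
ratios = [ratio(p, h) for h in (1e-1, 1e-2, 1e-3)]
```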

In this section, we shall describe the links between weak transport inequalities
and the Hamilton-Jacobi operator that was introduced in [16] and studied in [39].

Following [16], the weak transport cost between two probability measures
$\mu$ and $\nu$ satisfies the following duality formula:

$$\widetilde{W}_2(\nu|\mu)^2 = \sup_{g} \left\{ \int Q_1 g\, d\nu - \int g\, d\mu \right\}, \tag{5}$$
where the infimum-convolution operator $Q_t$ is defined as

$$Q_t g(x) := \inf_{p \in \mathcal{P}(\mathcal{X})} \left\{ \int g(y)\, p(dy) + \frac{1}{t} \left( \int d(x,y)\, p(dy) \right)^2 \right\}.$$

Later, the second author remarked (see [39]) that the operator $Q_t$ satisfies a
discrete version of the Hamilton-Jacobi equation: for all $t > 0$

+^ 0 ( 6 ) 

where \Vg\{x) := sup^g^y refer to jH |l3] for information about 

Hamilton-Jacobi equations in the continuous setting and their link with functional 
inequalities. 

Now let a G according to (|6|), one can easily check that for all t > 0, it 

holds 

The evolution with respect to time is controlled by this special "gradient". We
refer readers to [39] for properties of $Q$ and $\widetilde{\nabla}$. Here we shall develop some more:

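Proposition 2.2. (Convexity) Let $g$ be a function defined on $\mathcal{X}$. Then for all $x \in \mathcal{X}$,
the function $t \mapsto Q_t g(x)$ is convex.

Proof. Fix $x \in \mathcal{X}$ and define $G(t) := Q_t g(x)$. Observe that for any $\lambda \in [0,1]$ and
$p_1, p_2 \in \mathcal{P}(\mathcal{X})$, setting $p := \lambda p_1 + (1-\lambda) p_2 \in \mathcal{P}(\mathcal{X})$ and applying the Cauchy-Schwarz
inequality, it holds for all $t, s > 0$:

$$\left( \int d(x,z)\, p(dz) \right)^2 = \left( \lambda \int d(x,z)\, p_1(dz) + (1-\lambda) \int d(x,z)\, p_2(dz) \right)^2$$
$$\le (\lambda t + (1-\lambda) s) \left( \frac{\lambda}{t} \left( \int d(x,z)\, p_1(dz) \right)^2 + \frac{1-\lambda}{s} \left( \int d(x,z)\, p_2(dz) \right)^2 \right). \tag{8}$$

As a consequence, we get

$$\lambda \left( \int g(z)\, p_1(dz) + \frac{1}{t} \left( \int d(x,z)\, p_1(dz) \right)^2 \right) + (1-\lambda) \left( \int g(z)\, p_2(dz) + \frac{1}{s} \left( \int d(x,z)\, p_2(dz) \right)^2 \right)$$
$$\ge \int g(z)\, p(dz) + \frac{1}{\lambda t + (1-\lambda) s} \left( \int d(x,z)\, p(dz) \right)^2 \ge G(\lambda t + (1-\lambda) s). \tag{9}$$

Taking the infimum over all $p_1, p_2 \in \mathcal{P}(\mathcal{X})$ on the left-hand side of the inequality, the
conclusion follows. □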


The following lemma is a technical result connecting the gradient $\widetilde{\nabla}$ and the
$\Gamma$ operator.

Lemma 2.3. Let $\pi$ be the reversible probability measure for the Markov kernel $K$.
For any bounded functions $f$ and $g$ on $\mathcal{X}$, the following inequalities hold:

(i) /r(/,5)d7r^ V2J/|V<7||V/|d7r, 

(u) I f r(f, g)d^\ ^V^j\Vg\ v^dvr. 
Moreover, if we suppose that $f$ is nonnegative, then we have
{in) fT{f,g)d7ri^2V^J\Vg\^/fT{^/f)d^^, 

(iv) / r(V7)d7r ^ ^ / |V log fl^fdTT. 

Proof. The proofs of these four inequalities all follow similar arguments. Denote 
the positive part and negative part of a function u as and u- respectively. 

(i): Using the relation (nn)+ ^ u^v+ + U-V-, we have 


J ^{f,9)+d7T = 


~ d<{x, y)7r{x) 

X J ^ 

X y'^x 

+ '^(9(y) - 9{x))-ifiy) - f{x))-K{x, y)7r{x) 


X y-^x 

Now by reversibility of the measure vr, it holds 


'^'^i9{y) - 9{x))+{f{y) - fix))+K{x,y)TT{x) 

X X 

= '^'^i9{y) - 9{x))-{f{y) - f{x))-K{x,y)TT{x) 

X y^x 

^ J 2 ^^ 9 \{x)'^{f{x) - f{y))-K{x,y) 7 r{x), 

X y^x 


Where the latter inequality follows from |V 5 f|(x) ^ {g{y) — 9{x))- for all y ~ x. 
Therefore, we get 

/ ^{f,9)+dTT < ^ |Vy| ^(/(y) - f{x))-K{x,y)-K{x). (10) 

X y^x 

In (Uni, using r(/,y) ^ r(/,y)+ and Y.y^xU{y) “ f{x))-K{x,y) ^ |V/|(x)J(x), 
we get (i). 

{ii): By the Cauchy-Schwarz inequality, it holds 

(^Y.^f{y)-f{x))-K{x,y)\ ^J{x)Y,ifiy)-fix))lK{x,y)^2JT{f) (11) 

Combining (fTUI) and m leads to 

J T{f,g)+d7r^ I \Vg\y^2JT{f)d7r. (12) 


Following a similar argument, we have 

I T{f,g).d7r^ I\Vg\y^2JT{f)d7r. (13) 


and {ii) follows by (1121) . (1131) and the inequality 


r(/,5')rf7r 


^ max 


r(/,y)+dvr, / T{f,g)_d7r 
(iii): Since $f$ is nonnegative, $\sqrt{f}$ is well defined. Then it holds


f ^if,g)dTT = - fix)){g{y) - g{x))K{x,y)TT{x) 

X yr^X 

= '^(.9{y) - 9{x)){\/f{y) - '/l{x)){\/l{y) + y/f{x))K{x, y)-n{x) 

X yr^x 

Now arguing as in (i) and (ii), by reversibility of vr, we get 

/ r(/,5)d7r = '^'^{g{y)-g{x))-{^/f{y)-^/f{x))-{^/f{y) + ^/f{x))K{x,y)7^{x). 

X y^x 

Notice that (V7(y) - y/7ix))-{yj{y) + y/Jix)) ^ {V7{y) - Vf{x))-2^/J{x), we 
have 

/r(/,5)c^vr ^ 2 ^^(c?( 2 /) - g{x))-{^/]{y) - ^/f{x))-^/f{x)K{x,y)^T{x) 

X y^x 

I \Vg\\ffT{^)d7r 

where the last step we have used (|lll) with u := y/J. 

(iv): If / is the null function, there is nothing to say. Otherwise, if there exist 
x,y G X such that f{x) = 0,f{y) > 0, it is easy to see that |Vlog/(7/)p/(?/)7r(y) = 
oo. So we only need to prove the case f{x) > 0 for all x G X. 

Since / is a positive function, one can rewrite / = e®, it is enough to prove that 

J r(e®/^)d7r ^ y iVgpe^dvr 

holds for all function g. In fact, by convexity of function x e^, we have for all 
a > b, {a — 6)e“ ^ e“ — e^. Thus 

^ {g{x) - g{y)fe^^^'^K{x,y)Tr{x) 

^^y,9{y)<g{x) 

= A I r(e5/2)d7r 

□ 

3 Transport inequalities for Markov chains satisfying $CD(\kappa, \infty)$

In this section, we assume that the Markov chain satisfies the curvature condition
$CD(\kappa, \infty)$ for some $\kappa > 0$. One of the main tools we shall use is the following
sub-commutation relation between $\Gamma$ and the semigroup $P_t$, which was obtained in

m- 
Lemma 3.1. Assume that $CD(\kappa, \infty)$ holds. Then for any $f : \mathcal{X} \to \mathbb{R}$, we have

$$\Gamma(P_t f) \le e^{-2\kappa t} P_t \Gamma(f).$$

Remark 3.1. This property implies that if $f$ is 1-Lipschitz for $d_\Gamma$, then $P_t f$ is
$e^{-\kappa t}$-Lipschitz. Therefore, the condition $CD(\kappa, \infty)$ implies that the coarse Ricci
curvature of the Markov chain, using the distance $d_\Gamma$, is bounded from below by $\kappa$.
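
The sub-commutation $\Gamma(P_t f) \le e^{-2\kappa t} P_t \Gamma(f)$ can be observed on the two-point chain with kernel $K = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}$, whose semigroup has a closed form. In the sketch below, the parameters `a`, `b`, the helper names, and the value $\kappa = \min\{(3a+b)/2, (a+3b)/2\}$ (obtained by a direct computation from the definition of $\Gamma_2$ on this chain) are assumptions of the illustration.

```python
import math

# Sketch: checking Gamma(P_t f) <= exp(-2 kappa t) P_t Gamma(f) on a two-point
# chain. Parameters a, b and all helper names are illustrative assumptions.

def two_state_semigroup(a, b, t):
    """P_t = exp(t (K - I)) for K = [[1 - a, a], [b, 1 - b]], in closed form."""
    pi = [b / (a + b), a / (a + b)]
    decay = math.exp(-(a + b) * t)
    return [[pi[y] + decay * ((1.0 if x == y else 0.0) - pi[y]) for y in range(2)]
            for x in range(2)]

def apply(P, f):
    return [sum(P[x][y] * f[y] for y in range(2)) for x in range(2)]

def gamma(K, f):
    return [0.5 * sum((f[y] - f[x]) ** 2 * K[x][y] for y in range(2))
            for x in range(2)]

a, b = 0.3, 0.6
K = [[1 - a, a],
     [b, 1 - b]]
# Curvature constant of this chain, computed by hand from Gamma_2 >= kappa Gamma.
kappa = min((3 * a + b) / 2, (a + 3 * b) / 2)
f = [0.0, 1.0]

def gap(t):
    """Pointwise value of exp(-2 kappa t) P_t Gamma(f) - Gamma(P_t f)."""
    P = two_state_semigroup(a, b, t)
    lhs = gamma(K, apply(P, f))
    rhs = [math.exp(-2.0 * kappa * t) * v for v in apply(P, gamma(K, f))]
    return [r - l for l, r in zip(lhs, rhs)]

gaps = [gap(t) for t in (0.0, 0.1, 0.5, 1.0, 5.0)]
```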


3.1 $L^1$-transport inequalities

Here we shall prove two inequalities connected to the Wasserstein distance under
$CD(\kappa, \infty)$.


Proof of Theorem 1.10. The proof relies on the Kantorovich-Rubinstein duality
formula

$$W_1(\mu, \nu) = \sup_{g\ 1\text{-lip}} \int g\, d\mu - \int g\, d\nu.$$

Let $g$ be a 1-Lipschitz function for $d_\Gamma$. This is equivalent to having $\Gamma(g) \le 1$.
First, using the Cauchy-Schwarz inequality, it holds


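$$\left| \int \Gamma(P_t g, f)\, d\pi \right| = \frac{1}{2} \left| \sum_{x,y} (P_t g(y) - P_t g(x))(f(y) - f(x)) K(x,y)\, \pi(x) \right|$$
$$\le \frac{1}{2} \sum_{x,y} \left| (P_t g(y) - P_t g(x)) \left( \sqrt{f(y)} - \sqrt{f(x)} \right) \left( \sqrt{f(y)} + \sqrt{f(x)} \right) \right| K(x,y)\, \pi(x)$$
$$\le \frac{\sqrt{2}}{2} \sum_{x} \pi(x)\, \Gamma(\sqrt{f})(x)^{1/2} \left( \sum_{y} (P_t g(y) - P_t g(x))^2 \left( \sqrt{f(y)} + \sqrt{f(x)} \right)^2 K(x,y) \right)^{1/2}.$$

Now applying the Cauchy-Schwarz inequality again, the latter quantity is less than

$$\frac{\sqrt{2}}{2} \left( \int \Gamma(\sqrt{f})\, d\pi \right)^{1/2} \left( \sum_{x,y} (P_t g(y) - P_t g(x))^2 \left( \sqrt{f(y)} + \sqrt{f(x)} \right)^2 K(x,y)\, \pi(x) \right)^{1/2}.$$

Therefore, we have

$$- \int \Gamma(P_t g, f)\, d\pi \le \frac{\sqrt{2}}{4} \sqrt{I_\pi(f)} \left( \sum_{x,y} (P_t g(y) - P_t g(x))^2 \left( \sqrt{f(y)} + \sqrt{f(x)} \right)^2 K(x,y)\, \pi(x) \right)^{1/2}$$
$$\le \sqrt{I_\pi(f)} \sqrt{\int \Gamma(P_t g) f\, d\pi},$$

where in the last step we have used the reversibility of the measure $\pi$ and the fact
that $(\sqrt{f(y)} + \sqrt{f(x)})^2 \le 2(f(x) + f(y))$ for any nonnegative function $f$.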
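Therefore, according to Lemma 3.1, we have

$$\int g\, d\pi - \int g f\, d\pi = \int_0^{+\infty} \frac{d}{dt} \int (P_t g) f\, d\pi\, dt$$
$$= - \int_0^{+\infty} \int \Gamma(P_t g, f)\, d\pi\, dt$$
$$\le \sqrt{I_\pi(f)} \int_0^{+\infty} \sqrt{\int \Gamma(P_t g) f\, d\pi}\; dt$$
$$\le \sqrt{I_\pi(f)} \int_0^{+\infty} e^{-\kappa t} \sqrt{\int P_t(\Gamma(g)) f\, d\pi}\; dt$$
$$\le \frac{\sqrt{I_\pi(f)}}{\kappa}.$$

The result immediately follows by taking the supremum over all 1-Lipschitz functions
$g$. □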


Using (2), we get the following corollary:

Corollary 3.2. Assume that $CD(\kappa, \infty)$ holds with $\kappa > 0$. Then $\pi$ satisfies the
transport-information inequality $T_1I$ with constant $\kappa$. More precisely, for all probability
measures $\nu := f\pi$ on $\mathcal{X}$, it holds

$$W_{1,d_g}(f\pi, \pi)^2 \le \frac{J}{2\kappa^2} I_\pi(f).$$

Using similar arguments, we can also prove the following Cheeger-type inequality:

Proposition 3.3. Assume that $CD(\kappa, \infty)$ holds. Then for any probability density
$f$ with respect to $\pi$, we have

$$W_{1,d_\Gamma}(f\pi, \pi) \le \frac{1}{\kappa} \int \sqrt{\Gamma(f)}\, d\pi.$$

We call this a Cheeger-type inequality, by analogy with the classical Cheeger
inequality

$$\|f\pi - \pi\|_{TV} \le C \int |\nabla f|\, d\pi.$$

Here $\int \sqrt{\Gamma(f)}\, d\pi$ is an estimate on the gradient of $f$, while both $W_{1,d_\Gamma}(f\pi, \pi)$
and $\|f\pi - \pi\|_{TV}$ are distances of $L^1$ nature.
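Proof. Once more, by Kantorovich duality for $W_{1,d_\Gamma}$, and since 1-Lipschitz functions
are exactly the functions $g$ with $\Gamma(g) \le 1$, we have

$$W_{1,d_\Gamma}(f\pi, \pi) = \sup_{g;\ \Gamma(g) \le 1} \int g f\, d\pi - \int g\, d\pi$$
$$= \sup_{g;\ \Gamma(g) \le 1} - \int_0^{+\infty} \int \Gamma(P_t g, f)\, d\pi\, dt$$
$$\le \sup_{g;\ \Gamma(g) \le 1} \int_0^{+\infty} \int \sqrt{\Gamma(P_t g)} \sqrt{\Gamma(f)}\, d\pi\, dt$$
$$\le \sup_{g;\ \Gamma(g) \le 1} \int_0^{+\infty} e^{-\kappa t} \int \sqrt{P_t \Gamma(g)} \sqrt{\Gamma(f)}\, d\pi\, dt$$
$$\le \frac{1}{\kappa} \int \sqrt{\Gamma(f)}\, d\pi.$$

□

Remark 3.2. Again, by (2), with assumption $CD(\kappa, \infty)$, we get

$$W_{1,d_g}(f\pi, \pi) \le \frac{1}{\kappa} \sqrt{\frac{J}{2}} \int \sqrt{\Gamma(f)}\, d\pi.$$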


3.2 $L^2$-transport inequalities

Under condition $CD(\kappa, \infty)$, we have not been able to obtain a transport-entropy
inequality involving a weak transport cost. However, we can still obtain a bound
on $\widetilde{W}_{2,d_g}(\pi|f\pi)^2$ with the Dirichlet energy.

Proposition 3.4. Assume that CD{k, oo) holds. Then for any probability density 
f with respect to tt, we have 


11^2,dg(vr|/7r)^ ^ J T{f)dTr. 

Unlike the transport-information inequality, this inequality does not seem to 
imply a transport-entropy inequality, and does not seem to be directly related to 
concentration inequalities. 

Proof. First, for any bounded continuous function g on X, we have: 


J Qgfd-n 


J Qgdn = - j ^ J Pt{Qg)fdTrdt 

^- 1-00 P ^ r+oo P j -- 

J T{Pt{Qg)J)dndt = j ^TiPtiQg))y^)d7rdt (14) 


According to Lemma |3dl CD(«:, oo) implies that T{Pt{g)) ^ e Pt{T(g)) holds 

for all g. Hence, 


/ +00 p j -- p+oo p j -- 

J ^T{PtiQg))^/W)dndt ^ e-'^^ j ^PtT{iQg))^/T^)d7rdt (15) 
On the other hand, by Proposition 2.2 and (6), it holds


Qgdir — / gdir = 



Qtgdndt 


€ 



^Qt9\t=idTTdt ^ - ^ 


\ I IVQ5P 


diT (16) 


Now applying part (i) of Proposition 12.31 we get 

1 


1 

4 


\VQgfd7r ^ - 


1 


iV2J 


T{Qg)dTr 


Pt 


^ r+co 

T{Qg)dTr = / ( 

Jo 




4^2 J 

Combining (fT4]) . (fT^ . (fTHj) and (HZI, we have 


4^/2J 


PtT{Qg)dTTdt (17) 


J QsfdT^ ~ j QgdTT + / ( 


QgfdTT - I gd-JT = I QgfdTT - I Qgdir + / Qgdir — / gdir 

r+oo 








T{f)d7rdt 


V2J 


T{f)dn 


The conclusion follows by the duality fomula ([5]) while taking the supremum when g 
runs over all bounded continuous function on the left hand side of the last inequality. 

□ 


4 Transport inequalities for Markov chains satisfying $CDE'(\kappa_e, \infty)$

In this section, we assume that the exponential curvature condition $CDE'(\kappa_e, \infty)$
holds. We will prove Theorem 1.12. But first, we shall study some properties of
the $CDE'(\kappa_e, \infty)$ condition.


4.1 Properties of $CDE'(\kappa_e, \infty)$

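Lemma 4.1. Assume that $\mathrm{CDE}'(\kappa_e, \infty)$ holds. Then for any nonnegative function
$f : \mathcal{X} \to \mathbb{R}$ and any $t \ge 0$, we have

(i) $\Gamma(\sqrt{P_t f}) \le e^{-2\kappa_e t} P_t(\Gamma(\sqrt{f}))$.

(ii) $\dfrac{\Gamma(P_t f)}{P_t f} \le e^{-2\kappa_e t} P_t\left(\dfrac{\Gamma(f)}{f}\right)$.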

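Remark 4.1. (i) looks like the commutation formula of $\Gamma$ and $P_t$ under $\mathrm{CD}(\kappa, \infty)$
in the continuous setting. But it is not the same thing: the positivity is very important.
Recall that in classical Bakry-Emery theory, the commutation formula is the following:
for all $f$ (smooth enough), $\sqrt{\Gamma(P_t f)} \le e^{-\kappa t} P_t(\sqrt{\Gamma(f)})$. We have not been able
to recover this formula under $\mathrm{CDE}'(\kappa_e, \infty)$ in the graph setting.

Proof. The proof follows a standard interpolation argument. We begin with (i).
Let $g := P_{t-s} f$ and define $\varphi(s) := e^{-2\kappa_e s} P_s(\Gamma(\sqrt{g}))$. To obtain the result, it is
enough to show that $\varphi' \ge 0$. In fact,

$$\varphi'(s) = e^{-2\kappa_e s} P_s\left( L(\Gamma(\sqrt{g})) - \Gamma\left(\sqrt{g}, \frac{Lg}{\sqrt{g}}\right) - 2\kappa_e \Gamma(\sqrt{g}) \right) \ge 0,$$

where we have used the assumption on the curvature, which is equivalent to

$$\frac{1}{2} L\Gamma(\sqrt{g}) - \Gamma\left(\sqrt{g}, \frac{Lg}{2\sqrt{g}}\right) \ge \kappa_e \Gamma(\sqrt{g})$$

(see (3.11) in [5]).

Similarly, let $\psi(s) := e^{-2\kappa_e s} P_s\left(\frac{\Gamma(g)}{g}\right)$. Again, it is enough to show that $\psi' \ge 0$.
We have

$$\psi'(s) = e^{-2\kappa_e s} P_s\left( L\left(\frac{\Gamma(g)}{g}\right) + \frac{1}{g^2}\left(-2g\,\Gamma(g, Lg) + \Gamma(g) Lg\right) - 2\kappa_e \frac{\Gamma(g)}{g} \right). \qquad (18)$$

Since $g$ is positive, we only need to show that

$$g \left( L\left(\frac{\Gamma(g)}{g}\right) + \frac{1}{g^2}\left(-2g\,\Gamma(g, Lg) + \Gamma(g) Lg\right) - 2\kappa_e \frac{\Gamma(g)}{g} \right) \ge 0. \qquad (19)$$

Notice that (19) is equivalent to

$$g L\left(\frac{\Gamma(g)}{g}\right) - 2\Gamma(g, Lg) + \frac{1}{g}\Gamma(g) Lg \ge 2\kappa_e \Gamma(g),$$

and we conclude by writing

$$2\kappa_e \Gamma(g) \le 2\widetilde{\Gamma}_2(g) = 2\left(\Gamma_2(g) - \Gamma\left(g, \frac{\Gamma(g)}{g}\right)\right) = g L\left(\frac{\Gamma(g)}{g}\right) - 2\Gamma(g, Lg) + \frac{1}{g}\Gamma(g) Lg.$$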

□ 
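As a sanity check of part (ii), one can test the inequality numerically on the two-point chain of Section 6.3, for which the rates are K(0,1) = K(1,0) = 1/2 and the exponential curvature is 1. The sketch below is only an illustration under these assumptions: the function names are ours, the invariant measure is taken to be uniform, and we use the closed form $P_t f = \bar{f} + e^{-t}(f - \bar{f})$, which holds because $Lf = \bar{f} - f$ on this chain.

```python
import math

def heat_semigroup(f, t):
    """P_t f on the two-point chain with K(0,1) = K(1,0) = 1/2.

    Here Lf = mean(f) - f (uniform mean), so P_t f = mean + exp(-t) * (f - mean).
    """
    mean = 0.5 * (f[0] + f[1])
    return [mean + math.exp(-t) * (v - mean) for v in f]

def gamma(f):
    """Gamma(f)(x) = 1/2 * sum_y (f(y) - f(x))^2 K(x, y); same value at both points."""
    val = 0.25 * (f[0] - f[1]) ** 2
    return [val, val]

def lemma_4_1_ii_sides(f, t, kappa_e=1.0):
    """Return (lhs, rhs) of Gamma(P_t f)/P_t f <= exp(-2 kappa_e t) P_t(Gamma(f)/f)."""
    ptf = heat_semigroup(f, t)
    lhs = [g / p for g, p in zip(gamma(ptf), ptf)]
    ratio = [g / v for g, v in zip(gamma(f), f)]
    rhs = [math.exp(-2.0 * kappa_e * t) * v for v in heat_semigroup(ratio, t)]
    return lhs, rhs

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 3.0):
        lhs, rhs = lemma_4_1_ii_sides([1.0, 3.0], t)
        print(t, lhs, rhs)
```

On this chain $\Gamma(P_t f)$ is constant in space, so the inequality reduces to $1/P_t f \le P_t(1/f)$, which is Jensen's inequality; the check is therefore a consistency test rather than evidence for the general statement.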


4.2 Weak transport-information inequalities under $\mathrm{CDE}'(\kappa_e, \infty)$

Using Lemma 4.1, we can prove some weak transport-information inequalities under
$\mathrm{CDE}'(\kappa_e, \infty)$. First, we will prove Theorem 1.12.


Proof of Theorem 1.12. Let $a(t) := e^{-\kappa_e t}$. For any probability density $f$ with respect
to $\pi$ and any $t > 0$, it holds:


10 


dt 


j Qa{t)gPtfd'Kdt ^ J '^"^\'^Qa(t)9\^Ptf+ 'r{Qait)9,Ptf)dTTdt 

( 20 ) 


According to part (iii) of Lemma 2.3, we have


I Qsfd. - I si. = 1)^ f I 


Qa{t) 9 Ptfd 7 rdt 


1*00 o JpKet r 

^ / -/ r(v^)d7rdt. 

Jo J 


17 


















Now we apply Lemma 4.1 and we get


J QgfdTT 



Pt 



dirdt 





The conclusion then follows from the duality formula (5) by taking $\mu = \pi$ and
$\nu = f\pi$. □

It is easy to see that $\mathcal{I}_\pi(f) \le \widetilde{\mathcal{I}}_\pi(f) := \int \frac{\Gamma(f)}{f}\, d\pi$, thus we have the following
corollary:


Corollary 4.2. Assume that the exponential curvature condition $\mathrm{CDE}'(\kappa_e, \infty)$
holds. Then $\pi$ satisfies the following weak transport-information inequality:

$$\widetilde{W}_2(f\pi|\pi)^2 \le \frac{2J}{\kappa_e^2}\, \widetilde{\mathcal{I}}_\pi(f).$$
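The comparison between the Fisher information and the modified Fisher information used above can be checked numerically on a small example. The sketch below is only illustrative and rests on assumptions of ours: it takes the normalization $\mathcal{I}_\pi(f) = 4\int \Gamma(\sqrt{f})\, d\pi$ (consistent with the bound $\mathcal{I}_\pi(f_x) \le 4J(x)$ used in Section 6.2), the hypercube walk of Section 6.3 with $N = 2$, and the uniform invariant measure. All function names are hypothetical.

```python
import itertools
import math

N = 2
STATES = list(itertools.product((0, 1), repeat=N))
PI = 1.0 / len(STATES)  # uniform invariant measure (assumed)

def rate(x, y):
    """K(x, y) = 1/(2N) if x and y differ in exactly one coordinate, else 0."""
    return 1.0 / (2 * N) if sum(a != b for a, b in zip(x, y)) == 1 else 0.0

def gamma(f, x):
    """Gamma(f)(x) = 1/2 * sum_y (f(y) - f(x))^2 K(x, y)."""
    return 0.5 * sum((f[y] - f[x]) ** 2 * rate(x, y) for y in STATES)

def fisher(f):
    """I_pi(f) = 4 * integral of Gamma(sqrt(f)) d pi (assumed normalization)."""
    root = {x: math.sqrt(v) for x, v in f.items()}
    return 4.0 * sum(gamma(root, x) * PI for x in STATES)

def modified_fisher(f):
    """Modified Fisher information: integral of Gamma(f) / f d pi."""
    return sum(gamma(f, x) / f[x] * PI for x in STATES)

if __name__ == "__main__":
    # A positive density with respect to pi (its mean under pi is 1).
    f = dict(zip(STATES, [0.5, 1.5, 0.7, 1.3]))
    print(fisher(f), modified_fisher(f))
```

The inequality need not hold pointwise with the factor 4; it is the integrated form, which uses reversibility, that is being tested here.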


Unfortunately, we have not been able to relate $\widetilde{W}_2(\pi|f\pi)^2$ and
$\int \Gamma(\sqrt{f})\, d\pi$. However, as in Corollary 4.2, we get a weaker inequality as follows:

Theorem 4.3. Assume that the exponential curvature condition $\mathrm{CDE}'(\kappa_e, \infty)$
holds. Then $\pi$ satisfies the following weak transport-information inequality:

$$\widetilde{W}_2(\pi|f\pi)^2 \le \frac{2J}{\kappa_e^2}\, \widetilde{\mathcal{I}}_\pi(f).$$

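Proof. We prove this theorem in a similar way as the previous one.

Let $a(t) := 1 - e^{-\kappa_e t}$. Arguing as in the latter theorem, we get

$$\int Qg\, d\pi - \int g f\, d\pi = \int_0^{\infty} \frac{d}{dt} \int Q_{a(t)} g\, P_t f\, d\pi\, dt
\le \int_0^{\infty}\!\!\int -\frac{\kappa_e e^{-\kappa_e t}}{4} |\widetilde{\nabla} Q_{a(t)} g|^2 P_t f - \Gamma(Q_{a(t)} g, P_t f)\, d\pi\, dt.$$

Now applying part (ii) of Lemma 2.3, it follows that

$$\int Qg\, d\pi - \int g f\, d\pi
\le \int_0^{\infty}\!\!\int -\frac{\kappa_e e^{-\kappa_e t}}{4} |\widetilde{\nabla} Q_{a(t)} g|^2 P_t f + |\widetilde{\nabla} Q_{a(t)} g| \sqrt{2J\, \Gamma(P_t f)}\, d\pi\, dt$$

$$\le \int_0^{\infty} \frac{2J e^{\kappa_e t}}{\kappa_e} \int \frac{\Gamma(P_t f)}{P_t f}\, d\pi\, dt
\le \int_0^{\infty} \frac{2J e^{-\kappa_e t}}{\kappa_e} \int P_t\left(\frac{\Gamma(f)}{f}\right) d\pi\, dt
= \frac{2J}{\kappa_e^2} \int \frac{\Gamma(f)}{f}\, d\pi.$$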


□ 


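Remark 4.2. (i) One can get Corollary 4.2 by a similar argument: let $a(t) := e^{-\kappa_e t}$. Then

$$\int Qg\, f\, d\pi - \int g\, d\pi = -\int_0^{\infty} \frac{d}{dt} \int Q_{a(t)} g\, P_t f\, d\pi\, dt
\le \int_0^{\infty}\!\!\int -\frac{\kappa_e e^{-\kappa_e t}}{4} |\widetilde{\nabla} Q_{a(t)} g|^2 P_t f + \sqrt{2J}\, |\widetilde{\nabla} Q_{a(t)} g| \sqrt{\Gamma(P_t f)}\, d\pi\, dt$$

$$\le \int_0^{\infty} \frac{2J e^{\kappa_e t}}{\kappa_e} \int \frac{\Gamma(P_t f)}{P_t f}\, d\pi\, dt
\le \frac{2J}{\kappa_e} \int_0^{\infty} e^{-\kappa_e t} \int \frac{\Gamma(f)}{f}\, d\pi\, dt.$$

(ii) Using the notations of [18], define $W_2(f\pi, \pi)^2 = \frac{1}{2}\left(\widetilde{W}_2(f\pi|\pi)^2 + \widetilde{W}_2(\pi|f\pi)^2\right)$, and
denote by $\mathcal{P}_2(\mathcal{X})$ the set of probability measures on $\mathcal{X}$ with finite second
moment. Then $(\mathcal{P}_2(\mathcal{X}), W_2(\cdot, \cdot))$ is a metric space, and if $\mathcal{X}$ satisfies the exponential
curvature condition, we have an upper bound for $W_2(\cdot, \pi)$ in terms of the modified Fisher
information. Of course, when we work on a finite space, any probability measure
has finite second moment.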


5 Transport-information inequality for Markov chains 
with positive coarse Ricci curvature 


In this section, we assume the Markov chain has coarse Ricci curvature bounded
from below by $\kappa_c$, with respect to the graph distance $d_g$.

As a consequence of the bound on the curvature, note that for any 1-Lipschitz
function $g$, $P_t g$ is $e^{-\kappa_c t}$-Lipschitz.

The problem of proving a transport-entropy inequality for Markov chains with
positive coarse Ricci curvature was raised in Problem J in [35]. It was proved
by Eldan, Lee and Lehec [12]. The transport-information inequality is a slight
improvement of this result. Note that $T_1$ cannot hold in the full generality of the
setting of [34], since it implies Gaussian concentration, which does not hold for
some examples with positive curvature, such as Poisson distributions on $\mathbb{N}$.

The proof of this result will make use of the following lemma: 

Lemma 5.1. If the coarse Ricci curvature is bounded from below by $\kappa_c > 0$, then
Wi{f7r,Tr) ^ — 

xjty 


Proof. By Kantorovich duality for $W_1$, we have

Wi{fTr,7r)= sup [ gfdTT- [ gdir = sup - / ^ / Ptgfdndt 

ol—lip J J ql—lip Jo J 


gl-lip 
r+oo 




- Y] (Ptfiy) - Ptgix))if{y) - f{x))K{x,y)TT{x)dt 
do 

r+oo 

/ WPtgWiipY \ f{y) - f{x)\K{x,y)TT{x)dt 
do 




\fix) - f{y)\K{x,y)TT{x). 


'x^y 


□ 


We can now prove Theorem 1.13.

Proof of Theorem 1.13. Observe that
X] (^(^) + K{x,y)7T{x) 

x^y 

= X] ( 2 /( 3 ^) + 2/(y) - (a/7(x) - \ K{x, y)TT{x) 

^ + ‘^fiy))K{x, y)7r{x) - - Vfiy)) K{x, y)n{x) 

x^y xjty 

^4J-ix,(/) 

Now using Lemma 5.1, we have


Wi{f7r,7r) ^ — J2 “ /(y)l^(^>2/)^(®) 

/ 

x^y 

= lv^(®) “ \//(y)l (\/7(a:) + V7(y)) K{x,y)7r{x) 


^-,/T;{f) 

Kc 

tic 



['/fix) + '/f{y)) K{x,y)n{x) 

x^y 



□ 


6 Applications 

6.1 Transport-information inequalities imply transport-entropy
inequalities

We prove here the discrete version of Theorem 2.1 in [19]. The proof is essentially
unchanged; we give it to justify the validity of the theorem in the discrete setting.

Theorem 6.1. Assume that the transport-information inequality 

Wi{f7r,7rf ^ -^T^if) 
holds. Then we have the transport-entropy inequality 

Wl{fTT,TTf < |;EnW(/). 

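Proof. The transport-entropy inequality $T_1(C)$ is equivalent to the estimate

$$\int e^{\lambda f}\, d\pi \le \exp\left(\frac{\lambda^2}{2C}\right)$$

for all 1-Lipschitz functions $f$ with $\int f\, d\pi = 0$ and all $\lambda \ge 0$. Let $f$ be such a
function. Let $Z(\lambda) := \int e^{\lambda f}\, d\pi$ and $\mu_\lambda := e^{\lambda f} \pi / Z(\lambda)$. We have

$$\frac{d}{d\lambda} \log Z(\lambda) = \int f\, d\mu_\lambda \le W_1(\mu_\lambda, \pi).$$

Using the inequality $\int \Gamma(\sqrt{f})\, d\pi \le \int f\, \Gamma(\log f)\, d\pi$, we deduce

$$\frac{d}{d\lambda} \log Z(\lambda) \le \frac{\lambda}{C},$$

which integrates into $\log Z(\lambda) \le \lambda^2/(2C)$, and this is the bound we were looking
for. □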

We shall now show that the weak transport-information inequality I implies 
the weak-transport-entropy inequality H. The proof is an adaptation of the one 
for the $T_2$ and $T_2 I$ inequalities in the continuous setting from [19].


Theorem 6.2. Assume that vr satisfies the modified weak-transport information 
inequality r^/(C), then vr satisfies the weak-transport inequality T^HiC). 

Proof. According to |16) . the transport-entropy inequality T 2 is equivalent to 

^ dTT ^ exp j fd-K 

Usually, the class of functions $f$ we must use is the class of bounded continuous
functions, but here, since we work on a discrete space endowed with a graph distance,
we only have to work with bounded functions.

We write $F(t) := \log \int \exp(k(t) Q_t f)\, d\pi - k(t) \int f\, d\pi$ with $k(t) := Ct$. Let
$\mu_t$ be the probability measure with density with respect to $\pi$ proportional to
$\exp(k(t) Q_t f)$.

According to part (iv) of Lemma 2.3, we have


^ V/ : A —> M bounded. 



j ^ j\Vk{t)Qtf\^d^it 

Hence, for t > 0, we have 


^ 7- ( [ k'mtfe’^^^^^^fdTT- [ kit)\VQtf\^e’^^^^^^fd7r) -kft) [ fdir 

J exp{k{t)Qtf)dTr \J J J J 

^ ^ Qiitf)dt^t - j tfdn^ - kit) j\VQtf\‘^d^it 

^t^W2ilit,7r)^-kit) I\VQtf\^d^^t 

I I\^Qtf\^dpt 

= 0 


□ 


6.2 Transport-information inequalities imply diameter bounds 

We now show that transport-information inequalities imply diameter bounds, in
the spirit of the Bonnet-Myers theorem of [34].

Theorem 6.3 (Diameter estimate). Assume that the transport-information inequality

M^i,d(/7r,vr)2 < 

holds, for some distance d. Then 


sup d{x,y) ^ — 

x,y£X ^ 



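Proof. Let $f_x$ be the density function corresponding to the Dirac mass $\delta_x$, that is,
$f_x := \mathbf{1}_x / \pi(x)$. Then

$$\int \Gamma(\sqrt{f_x})\, d\pi = \frac{1}{2} \left( \sum_{z \sim x} \left(\sqrt{f_x(x)}\right)^2 K(x, z) \pi(x) + \sum_{z \sim x} f_x(x) K(z, x) \pi(z) \right)
= f_x(x) \sum_{z \sim x} K(x, z) \pi(x) \le J(x).$$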


Thus, for all x G T, Tn{fx) ^ 4J(x), it follows that LLi(/a;7r, vr)^ ^ ^J(x). As a 
consequence, for all x, y G A, it holds 

d{x,y) = Wi{5x,dy) ^ Wi{fx'n-,7r) + Wi{fyTr,7r) < ^ (\/J(x) + VJ{y)^ ■ 

□ 
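The Dirichlet energy of $\sqrt{f_x}$ computed in the proof can be checked directly on a small example. The sketch below evaluates $\int \Gamma(\sqrt{f_x})\, d\pi$ on the hypercube walk of Section 6.3 with $N = 3$ and compares it with the total jump rate out of $x$. It is only an illustration: the function names are ours, the invariant measure is taken to be uniform, and we assume that $J(x)$ can be read as $\sum_{z \ne x} K(x, z)$.

```python
import itertools
import math

N = 3
STATES = list(itertools.product((0, 1), repeat=N))
PI = 1.0 / len(STATES)  # uniform invariant measure (assumed)

def rate(x, y):
    """K(x, y) = 1/(2N) if x and y differ in exactly one coordinate, else 0."""
    return 1.0 / (2 * N) if sum(a != b for a, b in zip(x, y)) == 1 else 0.0

def dirichlet_sqrt_dirac(x0):
    """Integral of Gamma(sqrt(f_x0)) d pi, where f_x0 = 1_{x0} / pi(x0)."""
    root = {x: (math.sqrt(1.0 / PI) if x == x0 else 0.0) for x in STATES}
    total = 0.0
    for x in STATES:
        g = 0.5 * sum((root[y] - root[x]) ** 2 * rate(x, y) for y in STATES)
        total += g * PI
    return total

def jump_rate(x0):
    """Total jump rate out of x0 (our reading of J(x0), an assumption)."""
    return sum(rate(x0, z) for z in STATES if z != x0)

if __name__ == "__main__":
    x0 = STATES[0]
    print(dirichlet_sqrt_dirac(x0), jump_rate(x0))
```

On this example both quantities equal $N \cdot \frac{1}{2N} = \frac{1}{2}$, in line with the identity $\int \Gamma(\sqrt{f_x})\, d\pi = f_x(x) \sum_{z \sim x} K(x, z) \pi(x)$ from the proof.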


According to Corollary 3.2, we have the following corollary:

Corollary 6.4. Assume that $\mathrm{CD}(\kappa, \infty)$ holds. Then for all $x, y \in \mathcal{X}$,

$$d_\Gamma(x, y)\, \kappa \le 2\sqrt{2} \left( \sqrt{J(x)} + \sqrt{J(y)} \right).$$

Together with (3), we can recover Corollary 1.11.

If we look at the example of the discrete hypercube of dimension $N$ (see the
next subsection), with our notations it satisfies $\mathrm{CD}(1/N, \infty)$. The above theorem
gives the correct bound on the diameter for the graph distance of $N$.

6.3 An example: the discrete hypercube 

As an example of a Markov chain satisfying $\mathrm{CDE}'(\kappa, \infty)$, we study the
symmetric random walk on the discrete hypercube. It is a Markov chain on
$\{0,1\}^N$ which, at rate 1, selects a coordinate uniformly at random and flips it
with probability 1/2. The transition rates are $K(x, y) = 1/(2N)$ for $x, y$ such that
$d_g(x, y) = 1$, and 0 otherwise.

Theorem 6.5. The symmetric random walk on the discrete hypercube satisfies
$\mathrm{CDE}'(1/N, \infty)$.

Proof. We start with the case $N = 1$. Since then we only have to consider a
Markov chain on a two-point space, we can easily do explicit computations. Fix
$f : \{0,1\} \to \mathbb{R}$. We have

$$\Gamma(f)(0) = \Gamma(f)(1) = \frac{1}{4} (f(0) - f(1))^2$$
and hence T ^/, = 0 and

G2(/) = r2(/) = -r(/,L/) = r(/). 

Therefore when $N = 1$, the Markov chain satisfies $\mathrm{CDE}'(1, \infty)$.

The general case follows, using a tensorization argument. In the unnormalized
case, using Proposition 3.3 of [6], the graph satisfies $\mathrm{CDE}'(1, \infty)$ independently of
$N$. Since we consider the case of a Markov chain and enforce the normalization
condition, we rescale the generator by a factor $1/N$ (so that there is on average
one jump per unit of time), and therefore it satisfies $\mathrm{CDE}'(1/N, \infty)$. □

Remark 6.1. We have shown that for the two-point space, the exponential curvature
and the curvature are the same, and equal to 1. It has been stated elsewhere that the
curvature is 2. The difference arises because, since we enforced the normalization
condition, the definitions of $L$ in the two frameworks differ by a factor 2.
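The effect of the normalization on the curvature of the two-point space can be seen in a short computation. The sketch below builds $L$, $\Gamma$ and $\Gamma_2$ for the two-point chain with a given jump rate and returns the ratio $\Gamma_2(f)/\Gamma(f)$. It is only an illustration: the function names are ours, and we assume the conventions $\Gamma(f,g)(x) = \frac{1}{2}\sum_y (f(y)-f(x))(g(y)-g(x))K(x,y)$ and $\Gamma_2(f) = \frac{1}{2}L\Gamma(f) - \Gamma(f, Lf)$.

```python
STATES = (0, 1)

def make_operators(jump_rate):
    """Return (L, gamma, gamma2) for the two-point chain with K(0,1) = K(1,0) = jump_rate."""
    def k(x, y):
        return jump_rate if x != y else 0.0

    def L(f):
        # Generator: (Lf)(x) = sum_y (f(y) - f(x)) K(x, y).
        return [sum((f[y] - f[x]) * k(x, y) for y in STATES) for x in STATES]

    def gamma(f, g):
        # Carre du champ with the 1/2 normalization (assumed convention).
        return [0.5 * sum((f[y] - f[x]) * (g[y] - g[x]) * k(x, y) for y in STATES)
                for x in STATES]

    def gamma2(f):
        # Gamma_2(f) = 1/2 L Gamma(f) - Gamma(f, Lf).
        l_gamma = L(gamma(f, f))
        cross = gamma(f, L(f))
        return [0.5 * a - b for a, b in zip(l_gamma, cross)]

    return L, gamma, gamma2

def curvature_ratio(jump_rate, f):
    """Gamma_2(f)(x) / Gamma(f)(x) at each state; f must be non-constant."""
    _, gamma, gamma2 = make_operators(jump_rate)
    return [g2 / g for g2, g in zip(gamma2(f), gamma(f, f))]

if __name__ == "__main__":
    f = [1.0, 3.0]
    print(curvature_ratio(0.5, f))  # normalized chain used in this section
    print(curvature_ratio(1.0, f))  # generator multiplied by 2
```

With jump rate 1/2 the ratio is 1 at both points, and with jump rate 1 it is 2, which matches the factor 2 between the two frameworks mentioned in the remark.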